Claim Rejections - 35 USC §112



1. The following is a quotation of the first paragraph of 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall 
set forth the best mode contemplated by the inventor of carrying out his invention. 

2. Claims 32, 34, 47, 54, 64 are rejected under 35 U.S.C. 112, first paragraph, as 
failing to comply with the enablement requirement. The claim(s) contains subject matter 
which was not described in the specification in such a way as to enable one skilled in 
the art to which it pertains, or with which it is most nearly connected, to make and/or use 
the invention. For example, claim 32 recites that a plurality of microphones are arranged in an
n-fire configuration in the video conferencing bar. The specification does not explain
what an n-fire configuration is. Claim 34 recites two side bars having a plurality of
microphones and speakers, wherein the two side bars are vertical and are operable to
be placed on the two sides of the video display. There is no support for this limitation in
the applicant's specification. Claim 47 recites that the position signal indicates an angle
between the audio source and the remote videoconferencing system. Applicant's
specification does not explain what this means, nor is there any support for it in the
specification. Claims 54 and 64 are similar to claim 47.

Claim Rejections - 35 USC §102
1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 
(e) the invention was described in (1) an application for patent, published under section 122(b), by
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

2. Claims 59-62, 67-68, 69-71, and 76 are rejected under 35 U.S.C. 102(e) as being
anticipated by Westfield (US PAT: 6,779,979, filed 6-12-2001). 

Regarding claim 59, Westfield discloses a method of operating a processing unit 
for a local video conference system, the processing unit being coupled to a display, the 
method comprising: receiving at the processing unit (510, fig. 5) first and second video 
streams from a remote videoconferencing system, wherein the first and second video
streams comprise different areas derived from images of an area recorded at a video
camera (410/420, fig. 4) at the remote videoconferencing system, and displaying on the
display at least the first and second video streams (fig. 8B, col. 5, line 24 - col. 6, line 
33). 

Regarding claim 69, Westfield discloses a method of operating a processing unit 
for a local video conference system, the processing unit being coupled to a video 
camera, the method comprising: receiving at the video camera images of an area (fig. 
4), and sending the images to the processing unit (col. 5 lines 18-23), the processing 
unit (510, fig. 5), generating at least first and second video streams from the images, 
wherein at least the second video stream comprises a subset of the area, and 
transmitting the first and second video streams to a remote videoconferencing system 
(fig. 8B, col. 5, line 24 - col. 6, line 33). 

Regarding claims 60-62, 67-68, 70-71, and 76, Westfield further teaches the
following: first video stream comprises the entirety of the area (fig. 4), wherein both the
first and second video streams comprise subsets of the area, second sub stream
comprises a subset of the area, video camera (410/420, fig. 4) is fixed (fig. 8B, col. 5, 
line 24 - col. 6, line 33), second video stream is displayed on the display within the first 
video stream (fig. 8B). 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

4. Claims 63-66, 72-74 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Westfield in view of Nakamura (JP10-042264).

Regarding claims 63-66, 72-74, Westfield does not teach the following: subset of 
the area comprises an area around an acoustic source at the remote videoconferencing 
system, determining the position of the acoustic source relative to the local/remote 
videoconferencing system, position is determined through the interaction between
the audio signal from the acoustic source and plurality of microphones at the remote 
videoconferencing unit, receiving at the processing unit the audio signal and the 
position. 

However, Nakamura discloses a videoconferencing system which teaches the
following: subset of the area comprises an area around an acoustic source at the 
remote videoconferencing system, determining the position of the acoustic source 
relative to the local/remote videoconferencing system, position is determined through
the interaction between the audio signal from the acoustic source and plurality of 
microphones (3/4, Drawing 1) at the remote videoconferencing unit, receiving at the 
processing unit (5, Drawing 1) the audio signal and the position (paragraphs: 0007, 
0022-0028). 

Thus, it would have been obvious to one of ordinary skill in the art at the time
the invention was made to modify Westfield's system to provide for the following: subset of
the area comprises an area around an acoustic source at the remote videoconferencing
system, determining the position of the acoustic source relative to the local/remote
videoconferencing system, position is determined through the interaction between
the audio signal from the acoustic source and plurality of microphones at the remote 
videoconferencing unit, receiving at the processing unit the audio signal and the position 
as this arrangement would facilitate automatic tracking of the speaker in a 
videoconference system as taught by Nakamura (see claim 6), thus facilitating 
automatic camera movement to track the speaker. 

5. Claim 75 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
Westfield in view of Addeo et al. (US PAT: 5,335,011, hereinafter Addeo).

Regarding claim 75, Westfield does not teach the following: transmitting the 
audio signal and the position signal to the remote videoconferencing system. 

However, Addeo discloses a sound localization system for teleconferencing
using self-steering microphone arrays which teaches the following: the
audio signal and the position signal are transmitted to the remote conferencing system
(fig. 2, col. 5, lines 23-39).

Thus, it would have been obvious to one of ordinary skill in the art at the time
the invention was made to modify Westfield's system to provide for the following:
transmitting the audio signal and the position signal to the remote
videoconferencing system as this arrangement would facilitate creating more realistic
sound corresponding to the video as taught by Addeo (col. 5, line 66 - col. 7, line 4), 
thus creating ambience in the videoconference system. 

6. Claims 23, 29, 31-33, 35-37, 38, 42, and 44 are rejected under 35 U.S.C. 103(a)
as being unpatentable over Nakamura in view of Addeo. 

Regarding claim 23, Nakamura discloses a local videoconferencing device for 
videoconferencing having a local videoconferencing device with a video display and a 
remote videoconferencing device with a video display interconnected through the 
network, the local videoconferencing device comprising: a videoconferencing device 
bar, wherein videoconferencing bar comprises a video sensor (2, Drawings: 1, 3) for 
capturing images, a plurality of microphones (3/4, Drawings: 1, 3) for capturing sound, 
and a plurality of speakers (3/4, Drawings: 1, 3) for producing sound (paragraphs: 0009,
0015, 0036), wherein video sensors, the microphones and speakers are arranged in 
fixed positions in the videoconferencing bar (Drawing: 3), a processing unit (1/5, 
Drawing 1) coupled to the videoconferencing bar, a communication interface (not 
shown) coupled to the processing unit and other remote videoconferencing devices
through the network (paragraph: 0028), wherein the processing unit is operative to 
produce at least one of a first video stream from the signals received from the video 
sensor and an audio stream and audio source position signal from signals received from 
the microphones, wherein the processing unit is operative to receive at least one video 
stream, one audio stream (3, Drawings: 1, 3, paragraphs: 0029-0035).

Nakamura differs from claim 23 in that he does not teach the following: receiving 
audio source position signal from a remote videoconferencing device, and wherein the 
processing unit is operative to drive plurality of speakers to reproduce sound according 
to the received audio stream and audio source position signal. 

However, Addeo teaches the following: receiving audio source position signal 
from a remote videoconferencing device, and wherein the processing unit is operative to 
drive plurality of speakers to reproduce sound according to the received audio stream 
and audio source position signal (fig. 2 col. 5 lines 23-39, col. 5, line 66 - col. 7, line 4). 

Thus, it would have been obvious to one of ordinary skill in the art at the time
the invention was made to modify Nakamura's system to provide for the following: receiving
audio source position signal from a remote videoconferencing device, and wherein the
processing unit is operative to drive plurality of speakers to reproduce sound according
to the received audio stream and audio source position signal as this arrangement
would facilitate creating more realistic sound corresponding to the video as taught by
Addeo, thus creating ambience in the videoconference system. 

Regarding claim 38, Nakamura discloses a method for videoconferencing,
wherein plurality of videoconferencing devices are connected through a network, 
wherein each videoconferencing device comprises a videoconferencing bar having a 
video sensor, a plurality of microphones and speakers, a processing unit, a video 
display and a network interface, the method comprising: capturing video images with 
the video sensor in the videoconferencing bar, capturing audio signals with 
microphones (3/4, Drawings: 1, 3) in the videoconferencing bar (paragraphs: 0009, 
0015, Drawings: 1, 3), receiving video images and audio signals at the processing unit 
(1/5, Drawing 1), generating first video stream from the video images and an audio 
stream and an audio position signal from the audio signals, transmitting audio stream 
and video stream to a remote conferencing device, displaying the first video stream on a 
video display at the remote conferencing device (paragraphs: 0028-0036). 

Nakamura differs from claim 38 in that he does not teach the following: 
transmitting audio position signal to a remote conferencing device, and driving the 
speakers at the remote conferencing device to reproduce sound according to the audio 
stream and the audio position signal. 

However, Addeo teaches the following: transmitting audio position signal to a
remote conferencing device, and driving the speakers at the remote conferencing 
device to reproduce sound according to the audio stream and the audio position signal 
(fig. 2 col. 5 lines 23-39, col. 5, line 66 - col. 7, line 4). 

Thus, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to modify Nakamura's system to provide for the following:
transmitting audio position signal to a remote conferencing device, and driving the
speakers at the remote conferencing device to reproduce sound according to the audio 
stream and the audio position signal as this arrangement would facilitate creating more
realistic sound corresponding to the video as taught by Addeo, thus creating ambience 
in the videoconference system. 

Regarding claims 29, 32-33, 35-37, 42, Nakamura teaches the following: 
processing unit (1/5, Drawing 1) is operative to generate the position signal based on 
magnitude difference of the audio signals received from the plurality of microphones 
(3/4, Drawings: 1, 3; paragraphs: 0022-0023), plurality of microphones are arranged in an
n-fire configuration in the videoconferencing bar, wherein videoconferencing bar is
horizontal and operable to be placed on top of a video display (paragraph: 0009,
Drawings: 1, 3), video sensor (2, Drawings: 1, 3) has a wide viewing angle, viewing
angle is 65 degrees, further comprising a pan motor to increase the viewing angle of the 
video sensor (claims 6-7). 

Nakamura differs from claims 31, 44 in that he does not teach the following:
processing unit is operative to drive the plurality of speakers to reproduce sound 
according to the received audio signal and audio source position signal by selectively 
driving one or more speakers in response to received position signal from the 
videoconferencing device to play the audio signal corresponding to the image of the at 
least one video stream. 

However, Addeo teaches the following: processing unit is operative to drive the
plurality of speakers to reproduce sound according to the received audio signal and 
audio source position signal by selectively driving one or more speakers in response to
received position signal from the videoconferencing device to play the audio signal 
corresponding to the image of the at least one video stream (fig. 2 col. 5 lines 23-39, 
col. 5, line 66 - col. 7, line 4). 

Thus, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to modify Nakamura's system to provide for the following:
processing unit is operative to drive the plurality of speakers to reproduce sound 
according to the received audio signal and audio source position signal by selectively 
driving one or more speakers in response to received position signal from the 
videoconferencing device to play the audio signal corresponding to the image of the at 
least one video stream as this arrangement would facilitate creating more realistic sound
corresponding to the video as taught by Addeo, thus creating ambience in the 
videoconference system. 

7. Claims 24-28, 39-41 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Nakamura in view of Addeo as applied to claims 23 and 38 above, and further in 
view of Westfield. 

Regarding claims 24-27, 39-41, the combination does not teach the following:
video sensor is operative to produce high resolution video stream, wherein first video 
stream is of a first resolution, wherein processing unit is operative to produce a second 
video stream, and wherein second video stream is of a second resolution and is 
representing an area in the first video stream, wherein the first resolution of first video 
stream is 700x400 pixels, and wherein second resolution of the second video stream is 
300x200 pixels, wherein the maximum resolution of the video sensor is 3000x2000
pixels, wherein the second video stream represents the images of a speaking 
videoconference participant. 

However, Westfield teaches the following: video sensor is operative to produce 
high resolution video stream, wherein first video stream is of a first resolution, wherein 
processing unit is operative to produce a second video stream, and wherein second video stream is
of a second resolution and is representing an area in the first video stream, wherein the
first resolution of the first video stream is 2048x1526 pixels, and wherein second resolution of
the second video stream is 640x480 pixels, wherein the maximum resolution of the 
video sensor is 2048x1526 pixels (col. 4 lines 52-59), wherein the second video stream 
represents the images of a speaking videoconference participant (col. 5, line 24 - col. 6, 
line 33). 

Thus, it would have been obvious to one of ordinary skill in the art at the time
the invention was made to modify the combination to provide for the following: video
sensor is operative to produce high
resolution video stream, wherein first video stream is of a first resolution, wherein 
processing unit is operative to produce a second video stream, and wherein second 
video stream is of a second resolution and is representing an area in the first video 
stream, wherein the first resolution of first video stream is 700x400 pixels, and wherein 
second resolution of the second video stream is 300x200 pixels, wherein the maximum 
resolution of the video sensor is 3000x2000 pixels, wherein the second video stream 
represents the images of a speaking videoconference participant as this arrangement 
would provide necessary processing to meet the application requirements for intended
use as shown by Westfield.

Regarding claim 28, the combination teaches the following: second video stream 
follows the speaking videoconference participant and changes when the speaking 
videoconference participant changes (paragraphs: 0028-0035 of Nakamura).

8. Claim 34 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
Nakamura in view of Addeo as applied to claim 33 above, and further in view of 
Wallace, Jr. (US PAT: 4,311,874, hereinafter Wallace).

The combination differs from claim 34 in that although it teaches having a
horizontal side bar having plurality of microphones and speakers (Drawings: 1, 3 of
Nakamura), it does not teach the following: two side bars having plurality of
microphones and speakers, where the two side bars are vertical and are operable to be 
placed on two sides of the display. 

However, Wallace discloses teleconference microphone arrays which teaches 
disposition of plurality of microphones in a vertical plane (fig. 3, col. 3 lines 32-44). 

Thus, it would have been obvious to one of ordinary skill in the art at the time
the invention was made to modify the combination to provide for the following: two side bars
having plurality of microphones and speakers, where the two side bars are vertical and 
are operable to be placed on two sides of the display as this arrangement would provide 
one of the ways, among many possible ways, of arranging audio sensors to meet the 
application needs of a conference as taught by Wallace. 



Application/Control Number: 10/814,364 Page 13 

Art Unit: 2643 

9. Claims 30 and 43 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Nakamura in view of Addeo as applied to claims 23 and 38 above, and further in 
view of Simms, Jr. (US PAT: 3,618,035, hereinafter Simms). 

Regarding claims 30 and 43, the combination does not teach the following: 
processing unit is operative to synchronize the phases of the signals from the video 
sensor and a video stream output by a remote videoconference device for display on a 
remote video display. 

However, Simms teaches the following: a method of synchronizing the phase of 
the video signals transmitted to video display apparatus (claim 11). 

Thus, it would have been obvious to one of ordinary skill in the art at the time
the invention was made to modify the combination to provide for the following: processing
unit is operative to synchronize the phases of the signals from the video sensor and a 
video stream output by a remote videoconference device for display on a remote video 
display as this arrangement would facilitate displaying received signals properly. 

10. Claims 45-51 and 52-57 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Addeo.

Regarding claim 45, Addeo discloses a method of operating a processing unit for
a local videoconference system, the processing unit controlling a plurality of speakers, the
method comprising: receiving at the processing unit (140, fig. 2) from a remote 
videoconferencing system a position signal and an audio signal from an audio source, 
wherein the position signal is indicative of a position of the audio source relative to the 
remote videoconferencing system, and selectively driving at least one of the plurality of 
speakers (130, fig. 2) in accordance with the position signal to broadcast audio signal,
wherein the driven speakers are indicative of the position of the audio source relative to 
the remote videoconferencing system (fig. 2, col. 4, line 23 - col. 6, line 1). 

Regarding claim 52, Addeo discloses a method of operating a processing unit for 
a local videoconference system, the processing unit receiving input from a plurality of
microphones, the method comprising: receiving an audio signal from an audio source at 
the plurality of microphones (150, fig. 2), each microphone generating a microphone 
signal, generating a position signal from the microphone signals indicative of a position 
of the audio source relative to the local videoconferencing system, and transmitting the 
audio signal and the position signal to a remote videoconferencing unit (A, fig. 2, col. 4, line 23
- col. 6, line 1).

Regarding claims 46-51, 53-57, Addeo further teaches the following: audio 
source comprises a videoconference participant (for example 201, fig. 2), the position 
signal indicates an angle between the audio source and the remote videoconferencing 
system (implicit, as the reference teaches control device 140 that regenerates the audio
signal received from the station B in a manner such that the sound is perceived as 
emanating from the virtual point 121, col. 5, line 66 - col. 6, line 1), only one speaker 
(130, fig. 2) is driven, speakers are positioned in a linear array (fig. 2), position signal is 
derived at the remote videoconferencing system from the microphone signals generated 
by plurality of microphones (150, fig. 2), generating position signal comprises assessing 
the magnitude of the microphone signals (reads on determining volume zone of the
audio sound, col. 5 lines 34-36), microphones are positioned in a linear array (150, fig. 
2), position signal is used at the videoconferencing system to selectively drive at least
one of the plurality of speakers (130, fig. 2) in accordance with the position signal to 
broadcast the audio signal, wherein the driven speakers are indicative of the audio 
source relative to the remote videoconferencing system (col. 5 lines 36-39, lines 52-68, 
col. 6 line 1). 

11. Claim 58 is rejected under 35 U.S.C. 103(a) as being unpatentable over Addeo in 
view of Nakamura. 

Regarding claim 58, Addeo does not explicitly teach the following: microphones 
and speakers are both positioned in a linear array. 

However, Nakamura teaches the following: microphones and speakers are both 
positioned in a linear array (paragraph: 0009, and Drawings: 1, 3). 

Thus, it would have been obvious to one of ordinary skill in the art at the time
the invention was made to modify Addeo to provide for the following: microphones and
speakers are both positioned in a linear array as this arrangement would provide 
compact means for arranging the audio sensors as shown by Nakamura. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Melur Ramakrishnaiah whose telephone number is 
(571) 272-8098. The examiner can normally be reached on 9 Hr schedule.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Curt Kuntz, can be reached on (571) 272-7499. The fax phone number for
the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 